The airway microbiota of neonates colonized with asthma-associated pathogenic bacteria

Culture techniques have associated colonization with pathogenic bacteria in the airways of neonates with later risk of childhood asthma, whereas more recent studies utilizing sequencing techniques have shown the same phenomenon with specific anaerobic taxa. Here, we analyze nasopharyngeal swabs from 1 month neonates in the COPSAC2000 prospective birth cohort by 16S rRNA gene sequencing of the V3-V4 region in relation to asthma risk throughout childhood. Results are compared with previous culture results from hypopharyngeal aspirates from the same cohort and with hypopharyngeal sequencing data from the later COPSAC2010 cohort. Nasopharyngeal relative abundance values of Streptococcus pneumoniae, Haemophilus influenzae, and Moraxella catarrhalis are associated with the same species in the hypopharyngeal cultures. A combined pathogen score of these bacteria’s abundance values is associated with persistent wheeze/asthma by age 7. No other taxa are associated. Compared to the hypopharyngeal aspirates from the COPSAC2010 cohort, the anaerobes Veillonella and Prevotella, which have previously been implicated in asthma development, are less commonly detected in the COPSAC2000 nasopharyngeal samples, but correlate with the pathogen score, hinting at latent community structures that bridge current and previous results. These findings have implications for future asthma prevention efforts.

The leading authors have previously published findings from COPSAC cohorts, which were established by the late Dr. Bisgaard, and are considered some of the most well-characterized pediatric cohorts. In this submission, Dr. Thorsen and coworkers profile bacterial communities using 16S rRNA (V3-V4 region) from nasopharynx swabs of 1-month-old children from the COPSAC2000 cohort. They compared these communities to hypopharyngeal aspirate diagnostic culture results (targeting three respiratory pathogens H. influenzae, M. catarrhalis and S. pneumoniae) from 244 children from the same cohort (data previously published Bisgaard et al., NEJM 2007). They report a strong correlation between culture results and sequenced communities when it comes to the detection of the three pathogenic species. A high pathogen score derived from the relative abundance of the three pathogens was associated with asthma, much like what has been previously reported based on culture. Bacterial communities in this study showed low diversity of oral anaerobes Veillonella and Prevotella, previously found in hypopharyngeal aspirates from 1-month-old children (COPSAC2010 - published Thorsen et al, Nat Comm 2019) to be associated with asthma - these genera were not found to be associated with asthma in this study - the authors attribute it to likely being due to sampling different sites of the airway.

Major comments:
After reading this cleanly put-together manuscript, I'm left perplexed about what novelty it adds to the field of airway microbiome in pediatric asthma. Correlation between culture and sequencing of the three respiratory pathogens had been established in this field, and their association with asthma development later in life is well established. The study is inadequately designed to address bacterial community composition between the hypopharynx and nasopharynx; as such, the conclusions related to the differences between the airway compartments are overstated. My disappointment also extends to the manuscript's narrative solely focused on the COPSAC cohorts - the narrative would benefit from expanding the scope to what is known about airway microbiome colonization in asthma.

Specific comments:
1. Nasopharyngeal swabs used for this study were stored in SP4 mycoplasma transport medium [line 199]; the samples were kept at 4C until delivery to the laboratory; they were then used for cultivation of mycoplasma - which means they were likely exposed to ambient temperature and air for at least a few hours before being stored at -80C. Can you confirm that these conditions (specifically undergoing mycoplasma culture protocol) do not change bacterial profiles? Specifically in relation to anaerobic bacteria?
2. The median sequencing depth was 60723 with IQR 45384-79958 [lines 311-314]. Was the data normalized to one depth? Or did it undergo another form of normalization?
3. S. aureus is shown in Figure 2 with what looks like a good correlation to culture results, but there is no reference to it in the text - is there a reason why it was not included in the pathogen score? It has been linked to asthma outcomes in children, was there no association with S. aureus and any risk estimates? It at least deserves an explanation on why it was omitted from the pathogen score calculation (other than it was not included in Bisgaard et al 2007 - but more recent evidence implicates it in asthma genesis). How do the results change if you include its abundance in an updated pathogen score?
4. Positive pathogen culture was a stronger predictor of asthma than detection of these by sequencing. Is that because sequencing is more sensitive than culture, which requires a higher relative abundance of these organisms in a sample? Have you looked at the distribution of samples by dominant organisms (not presence or absence but dominance of pathogens vs commensals)?
5. Shannon diversity was not associated with asthma [line 375] by age 7. What about phylogenetic diversity?
6. Overall microbial composition was not significantly different between children who developed asthma by age 7 and those who did not by weighted UniFrac. What about unweighted UniFrac?
7. Was bacterial diversity associated with the pathogen score? Or the relative abundance of specific pathogens or commensals?
8. "Veillonella and Prevotella are less common in the nasopharynx than hypopharynx, and do not associate with asthma" [lines 396-397]. This conclusion is an overstatement and may be misleading since the supporting data is derived from separate cohorts, where samples were not handled the same (COPSAC2000 samples are much older and were used for culture of mycoplasma before storage for sequencing); the sampling approach was different (aspirate vs swab). All these factors are known to affect the composition of the airway microbiome. Addressing similarities rather than differences between these two cohorts may be more convincing.
Reviewer #2 (Remarks to the Author): This paper reports a novel analysis of airway microbiota in nasopharyngeal swabs of the same children in whom previously positive associations of airway pathogens cultured from early life hypopharyngeal aspirates with school-age asthma were found. The original findings were not replicated but a summation score of relative abundance values for 16S rRNA assessment of the previously associated pathogens was related to subsequent asthma.
The question arises why the previously described associations cannot be replicated. It remains unclear whether this is due to a lack of statistical power, selection by culturing, sampling location, or different 16S rRNA regions amplified. The authors might try to disentangle the different explanations. As it is now, too many parameters vary and do not allow for a sensible comparison and sound conclusions.
The main conclusion that "the nasopharyngeal microbiota, assessed using 16S sequencing, was strongly correlated with culturing of pathogenic airway bacteria from hypopharyngeal aspirates" is actually challenged by the quite modest AUROC values. Rather the discrepancy suggests that the cultures revealed a particular aspect that is "ignored" by 16S rRNA sequencing. This notion is supported by the statement in lines 349-352: "The association disappeared when repeating the analysis only among the children with no pathogenic bacteria detected by cultures (HR 0.93 [0.60;1.46], p=0.44, n=188) or in all children when adjusting for the presence of pathogenic bacteria (HR 1.06 [0.73;1.52], p=0.76)". So culturing might reveal a distinct characteristic that is missed by direct amplification and sequencing.
The usage of a pathogen score is counterintuitive in a paper on the specificity of bacterial colonization. Moreover, it might be a proxy for host conditions rather than a summation of effects of pathogens. One may also perform sensitivity analyses leaving out one taxon at a time to quantify their relative contribution to the effect.
Figure 1 is very illustrative. However, one might also mention the risk profiles (family history) and the underlying intervention of COPSAC2010. Since COPSAC2010 is an intervention study, stratified or adjusted analyses are recommended. The selection of participants of observational studies and trials may vary. This should be considered when comparing the two different study types.
As suggested by Figures 1 and 4, COPSAC2010 is an essential part of this paper. So it should also be mentioned in the abstract.
Figure 2A is not very informative because AUROC is a global measure veiling potentially substantial differences between sensitivity and specificity. E.g. for H. influenzae sequencing is highly specific but hardly sensitive as illustrated by Fig. S1. Moreover, the scale of AUROC is prone to misinterpretation due to its offset of 0.5. One may replace Fig. 2A by Fig. S1, as it is not affected by the mentioned shortcomings.
There are no inverse correlations of asthma with sequenced taxa reported. They might be informative for protective taxa previously missed by culturing techniques.
There might be a cohort effect in the sense that lead pathogens might change over time. Is there any data available (e.g. from hospital microbiology records) that the spectrum of respiratory pathogens might have changed over the years?

REVIEWER COMMENTS
Reply: We thank the reviewers for the thorough evaluation and constructive suggestions for improving the manuscript.
Reviewer #1 (Remarks to the Author): The leading authors have previously published findings from COPSAC cohorts, which were established by the late Dr. Bisgaard, and are considered some of the most well-characterized pediatric cohorts. In this submission, Dr. Thorsen and coworkers profile bacterial communities using 16S rRNA (V3-V4 region) from nasopharynx swabs of 1-month-old children from the COPSAC2000 cohort. They compared these communities to hypopharyngeal aspirate diagnostic culture results (targeting three respiratory pathogens H. influenzae, M. catarrhalis and S. pneumoniae) from 244 children from the same cohort (data previously published Bisgaard et al., NEJM 2007). They report a strong correlation between culture results and sequenced communities when it comes to the detection of the three pathogenic species. A high pathogen score derived from the relative abundance of the three pathogens was associated with asthma, much like what has been previously reported based on culture. Bacterial communities in this study showed low diversity of oral anaerobes Veillonella and Prevotella, previously found in hypopharyngeal aspirates from 1-month-old children (COPSAC2010 - published Thorsen et al, Nat Comm 2019) to be associated with asthma - these genera were not found to be associated with asthma in this study - the authors attribute it to likely being due to sampling different sites of the airway.

Major comments:
Comment 1 (C1): After reading this cleanly put-together manuscript, I'm left perplexed about what novelty it adds to the field of airway microbiome in pediatric asthma. Correlation between culture and sequencing of the three respiratory pathogens had been established in this field, and their association with asthma development later in life is well established. The study is inadequately designed to address bacterial community composition between the hypopharynx and nasopharynx; as such, the conclusions related to the differences between the airway compartments are overstated. My disappointment also extends to the manuscript's narrative solely focused on the COPSAC cohorts - the narrative would benefit from expanding the scope to what is known about airway microbiome colonization in asthma.
Reply 1 (R1): We thank the reviewer for these spot-on comments and are grateful for the opportunity to remedy these shortcomings of the manuscript. The whole endeavor from our side - digging out these 20-year-old samples hidden in our freezer and re-examining them using sequencing techniques - was indeed an effort to learn more about the "dark matter" of these famous old samples. The main research question was therefore whether additional knowledge could be gained regarding the old culture-based asthma associations. We conclude that the "dark matter" did not provide strong additional predictive power, and our findings can in that light be considered negative, but nevertheless important, as various newer studies have pointed to non-cultivatable bacteria being important asthma predictors. Our study was indeed not designed to compare hypopharynx and nasopharynx. While we addressed this limitation already in the previous manuscript version, the wording in the Conclusion and elsewhere did not adequately reflect that it was never our primary intention. Rather, we had the opportunity to analyze these samples and had to first establish that they indeed were comparable with the old culture data. We have revised the manuscript to soften the wording regarding these differences but would be grateful to the reviewer and editor for pointing out any such places that we overlooked. In addition, we have underscored that our study was not designed to test such differences in detection; in the discussion: "… However, other potential reasons include the numerous differences in sample collection, handling and storage - our study was not designed to infer the cause of such differences in observed abundances, but rather proved an opportunity to attempt to reconcile the apparent differences in asthma associations between the cohorts." Furthermore, we agree that the manuscript could benefit from "zooming out" to include other cohorts' findings and to point forward to the challenges the field will be tackling in the coming years. This is now included as a substantial addition to the Discussion.
Specific comments:
C2: Nasopharyngeal swabs used for this study were stored in SP4 mycoplasma transport medium [line 199]; the samples were kept at 4C until delivery to the laboratory; they were then used for cultivation of mycoplasma - which means they were likely exposed to ambient temperature and air for at least a few hours before being stored at -80C. Can you confirm that these conditions (specifically undergoing mycoplasma culture protocol) do not change bacterial profiles? Specifically in relation to anaerobic bacteria?
R2: We thank the reviewer for highlighting this, and we agree with this sentiment - it cannot be ruled out, and may even be plausible, that this handling protocol, which differed from the one applied in the COPSAC2010 cohort, could contribute to differences between the two cohorts, and specifically in the reduction of detection of Veillonella and Prevotella, among other anaerobic taxa. We have now described this limitation in the Discussion section: "Similarly, since the samples underwent culture for mycobacteria before freezing to -80°C, which could contribute to degradation of anaerobic bacteria like Veillonella and Prevotella. Notably, such skewness, like other technical biases, would equally affect all samples and thus not interfere with associations with clinical phenotypes. However, they could interfere with the comparisons made with other cohorts, including COPSAC2010." And in the interpretation regarding Veillonella and Prevotella, "Another reason could be that the samples were not immediately frozen after collection."
C3: The median sequencing depth was 60723 with IQR 45384-79958 [lines 311-314]. Was the data normalized to one depth? Or did it undergo another form of normalization?
R3: The data was not normalized to a single sequencing depth, but rather normalized in different ways as appropriate depending on the analysis. For most analyses, relative abundance normalization (total sum scaling) was used, but e.g. in the differential abundance analysis, we used relative abundance normalization for the Cox regression and DESeq2's standard quantile-type normalization for the DESeq2 analysis, which could contribute to part of the observed differences between the two methods. Where appropriate, we also included library size as a technical covariate used to adjust our analyses.
C4: S. aureus is shown in Figure 2 with what looks like a good correlation to culture results, but there is no reference to it in the text - is there a reason why it was not included in the pathogen score? It has been linked to asthma outcomes in children, was there no association with S. aureus and any risk estimates? It at least deserves an explanation on why it was omitted from the pathogen score calculation (other than it was not included in Bisgaard et al 2007 - but more recent evidence implicates it in asthma genesis). How do the results change if you include its abundance in an updated pathogen score?
R4: We did indeed base our analysis priorities on first recapitulating the culture-based findings from COPSAC2000 (where S. aureus was cultured, but not associated with later asthma) with the new sequencing data before expanding the view to other taxa. Among the taxa examined in the differential abundance analysis was also S. aureus; however, it was not significant in either the Cox regression or the DESeq2 analysis. Therefore, including it in the pathogen score alongside the other three does not add anything to the analysis, but instead interferes with the signal from the three a priori chosen species (crude HR 1.25 [0.98;1.70], p=0.15). Please also refer to R8 where we show that S. aureus was actually inversely correlated with the pathogen score.
C5: Positive pathogen culture was a stronger predictor of asthma than detection of these by sequencing. Is that because sequencing is more sensitive than culture, which requires a higher relative abundance of these organisms in a sample? Have you looked at the distribution of samples by dominant organisms (not presence or absence but dominance of pathogens vs commensals)?
R5: We thank the reviewer for this suggestion of a dominance-based analysis. We have prepared a figure for the reviewer showing the distribution of the pathogen score before log transformation, in a relative abundance space. The vertical line represents the cutoff for pathogen dominance at 50%. As can be seen on the figure, the group of pathogen-dominated samples comprises only n=14. Despite this small group size, the signal towards asthma persists; crude HR 1.33 [1.12;1.58], p=0.0014. While the analysis certainly is of interest, the small group size ultimately limits its usefulness.
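The dominance rule discussed in R5 can be illustrated with a minimal sketch. It assumes total sum scaling as mentioned in R3 and the 50% cutoff from R5; the read counts, taxon labels, and function names are hypothetical and are not taken from the authors' pipeline.

```python
# Illustrative sketch only: the counts below are made up, not study data.
PATHOGENS = ("S. pneumoniae", "H. influenzae", "M. catarrhalis")


def relative_abundance(counts):
    """Total sum scaling: divide each taxon's read count by the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def pathogen_fraction(counts, pathogens=PATHOGENS):
    """Summed relative abundance of the three a priori chosen pathogens."""
    rel = relative_abundance(counts)
    return sum(rel.get(taxon, 0.0) for taxon in pathogens)


def is_pathogen_dominated(counts, cutoff=0.5):
    """Flag a sample whose pathogen fraction exceeds the cutoff (50% in R5)."""
    return pathogen_fraction(counts) > cutoff


# Two hypothetical samples with 1000 reads each.
sample_a = {"M. catarrhalis": 600, "S. aureus": 300, "Veillonella": 100}
sample_b = {"S. aureus": 900, "H. influenzae": 50, "S. pneumoniae": 50}

print(is_pathogen_dominated(sample_a), is_pathogen_dominated(sample_b))
```

Because the fraction is computed after total sum scaling, the flag does not depend on sequencing depth, only on the composition of each sample.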
C6: Shannon diversity was not associated with asthma [line 375] by age 7. What about phylogenetic diversity?
R6: We thank the reviewer for this suggestion and have now included Faith's Phylogenetic Diversity in the analysis, since it provides additional and complementary information to the existing alpha diversity analysis with the Shannon index. However, there was no significant association with asthma. The results have been updated: "We quantified the diversity of the microbiome using the Shannon diversity index and Faith's phylogenetic diversity, neither of which was associated with asthma (Cox regression, Shannon: HR 0.72 [0.42;1.22], p=0.22; Faith: HR 1.02 [0.88;1.18], p=0.78; n=285)."
C7: Overall microbial composition was not significantly different between children who developed asthma by age 7 and those who did not by weighted UniFrac. What about unweighted UniFrac?
R7: We agree with the reviewer that it could be interesting to include this complementary distance metric and, similar to the previous question on alpha diversity, we have expanded the results to include a comparison using the unweighted UniFrac metric. "We compared the overall microbiome composition of children who developed asthma by age 7 with those that did not, which were similar (PERMANOVA, unweighted UniFrac, F=1.21, p=0.17; log weighted UniFrac, F=1.17, p=0.25; n=221)."
C8: Was bacterial diversity associated with the pathogen score? Or the relative abundance of specific pathogens or commensals?
R8: We analyzed the Shannon diversity index and Faith's phylogenetic diversity, in line with the update in R6. We found that indeed, as suggested by the reviewer, both metrics were associated with the pathogen score (see plots including results from linear models and Spearman's correlation tests below). Furthermore, we analyzed these diversity metrics in relation to the same subset of taxa above the cutoff for the differential abundance analysis (Fig. 3). We found that most taxa were positively associated with both Shannon and Faith diversity metrics; this was not particularly different between our three pathogens of interest (marked with color) and other taxa. In particular, diversity was inversely associated with Staphylococcus aureus, which makes a lot of sense, since this is by far the taxon with highest relative abundance, thereby numerically "suppressing" the relative abundance of other taxa, which will also "suppress" diversity metrics. In the lollipop plots below, one star is p < 0.05, two stars q < 0.05. Finally, to wrap up this interesting analysis, we compared the pathogen score with each taxon (which is related to the Veillonella/Prevotella-focused figure S5). Here, we found that, unsurprisingly, the three pathogens on which it was based exhibited the strongest correlations. However, the 4th and 5th strongest were seen from two Veillonella species, hinting at a potential shared subcommunity structure as seen in Figure S5. Note that Prevotella was too rare to be above the threshold for this analysis. Again, this was in opposite direction with Staphylococcus spp. To summarize, while the pathogen score is indeed associated with diversity, the two metrics express different traits in terms of taxa composition, where the pathogen score is much more specific for the three pathogens and their related co-colonizing taxa.
C9: "Veillonella and Prevotella are less common in the nasopharynx than hypopharynx, and do not associate with asthma" [lines 396-397]. This conclusion is an overstatement and may be misleading since the supporting data is derived from separate cohorts, where samples were not handled the same (COPSAC2000 samples are much older and were used for culture of mycoplasma before storage for sequencing); the sampling approach was different (aspirate vs swab). All these factors are known to affect the composition of the airway microbiome. Addressing similarities rather than differences between these two cohorts may be more convincing.
R9: As noted in R1 and R2, we completely agree with this assessment and have adjusted the wording accordingly throughout the manuscript. Our study was not designed to compare nasopharynx and hypopharynx directly; rather, we had to use the samples/data at our disposal with the inherent limitations stemming from that. Please refer to R1 and R2 for related answers.
Reviewer #2 (Remarks to the Author): This paper reports a novel analysis of airway microbiota in nasopharyngeal swabs of the same children in whom previously positive associations of airway pathogens cultured from early life hypopharyngeal aspirates with school-age asthma were found. The original findings were not replicated but a summation score of relative abundance values for 16S rRNA assessment of the previously associated pathogens was related to subsequent asthma.
C10: The question arises why the previously described associations cannot be replicated. It remains unclear whether this is due to a lack of statistical power, selection by culturing, sampling location, or different 16S rRNA regions amplified. The authors might try to disentangle the different explanations. As it is now, too many parameters vary and do not allow for a sensible comparison and sound conclusions.
R10: As outlined in figure 1, there are several previous findings that we build on in this figure. The first one is the culture results from COPSAC2000 and asthma. We conclude that these findings were recapitulated with the sequencing data, but we do not consider this a replication, since the cohort is the same. The second result is the findings from COPSAC2010 based on sequencing data concerning Veillonella, Prevotella, and asthma. This finding we were not able to directly replicate, seemingly due to low detection in the COPSAC2000 samples. Here, we have now expanded the description of the differences that may contribute to the reduced detection of these taxa; please refer to R1, R2, and R9 for further details.
C11: The main conclusion that "the nasopharyngeal microbiota, assessed using 16S sequencing, was strongly correlated with culturing of pathogenic airway bacteria from hypopharyngeal aspirates" is actually challenged by the quite modest AUROC values. Rather the discrepancy suggests that the cultures revealed a particular aspect that is "ignored" by 16S rRNA sequencing. This notion is supported by the statement in lines 349-352: "The association disappeared when repeating the analysis only among the children with no pathogenic bacteria detected by cultures (HR 0.93 [0.60;1.46], p=0.44, n=188) or in all children when adjusting for the presence of pathogenic bacteria (HR 1.06 [0.73;1.52], p=0.76)". So culturing might reveal a distinct characteristic that is missed by direct amplification and sequencing.
R11: We mostly agree with this comment from the reviewer, especially the interesting phenomenon that the culture data seem to capture something unique which is not fully covered in the sequencing data. As the reviewer also mentions in C10, pertaining to the comparison of nasopharyngeal sequencing and hypopharyngeal culture, we acknowledge the limitation that at least 2 important factors vary together: sampling location and detection technique, which precludes us from concluding which is more important. When taking this into account, we however think that the associations are indeed quite strong: since both culturing and 16S sequencing are imperfect methods, which adds noise to the system, we would not expect AUC values anywhere close to 1. While not directly comparable, in a previous study we also found imperfect agreement of culturing and sequencing of both airway and fecal samples (Gupta et al, Communications Biology 2019, doi 10.1038/s42003-019-0540-1); of note, this was a comparison with both techniques done on the same sample set. So, it is important to interpret these numeric estimates in light of what can realistically be expected. In this paper, the main motivation for the analysis is to show that these detection methods show agreement before it is reasonable to move on to the main point of the paper, which is the association with asthma.
C12: The usage of a pathogen score is counterintuitive in a paper on the specificity of bacterial colonization. Moreover, it might be a proxy for host conditions rather than a summation of effects of pathogens. One may also perform sensitivity analyses leaving out one taxon at a time to quantify their relative contribution to the effect.
R12: We agree with the reviewer that this colonization pattern may very well be a host-associated factor. As mentioned in the limitations, we cannot infer any directionality or causality from these observational data. This applies regardless of choice of detection method (culturing vs sequencing) and any data analysis approaches which can be applied. Based on this comment from the reviewer, which is partially echoed in C1, we have found it prudent to elaborate on this aspect in the discussion: "What is not clear from these results is whether specific bacterial taxa, bacterial functions, or host characteristics such as mucosal immune responses or another latent susceptibility are key in forming this association. We lack a causal and mechanistic understanding which is sorely needed before research in this field can progress to the next step. The most crucial question here is whether the bacteria are "to blame" for this association, by e.g. initiating a trajectory of chronic inflammation, in which case one could envision targeted manipulation of the early-life airway microbiota as a future means of preventing or treating asthma, or whether the bacteria are differentially discovered in children at high risk for asthma due to inherent latent susceptibilities already present in early life." As suggested by the reviewer, we have performed a sensitivity analysis leaving out each of the three pathogenic species in turn, see below. We conclude that no single species seems to be driving the association, which is mirrored by the individual species associations with asthma presented in Figure 3, in which all three species show the same directionality of association.
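The leave-one-out sensitivity analysis described in R12 can be sketched as follows. The exact definition of the pathogen score is not given in this correspondence, so the sketch assumes a log10-transformed sum of relative abundances with a small pseudocount; that formula, the pseudocount value, the example abundances, and all function names are assumptions for illustration only. Each reduced score would then be entered into the asthma model in place of the full score, a step not shown here.

```python
import math

PATHOGENS = ("S. pneumoniae", "H. influenzae", "M. catarrhalis")
PSEUDOCOUNT = 1e-6  # assumed value; avoids log10(0) when no pathogen is detected


def pathogen_score(rel_abund, taxa):
    """Assumed score: log10 of the summed relative abundance of the given taxa."""
    return math.log10(sum(rel_abund.get(t, 0.0) for t in taxa) + PSEUDOCOUNT)


def leave_one_out_scores(rel_abund, taxa=PATHOGENS):
    """Return the full score plus one reduced score per omitted taxon."""
    scores = {"all": pathogen_score(rel_abund, taxa)}
    for left_out in taxa:
        kept = tuple(t for t in taxa if t != left_out)
        scores["without " + left_out] = pathogen_score(rel_abund, kept)
    return scores


# Hypothetical relative abundances for one sample (not study data).
sample = {
    "S. pneumoniae": 0.10,
    "H. influenzae": 0.02,
    "M. catarrhalis": 0.30,
    "S. aureus": 0.58,
}

loo = leave_one_out_scores(sample)
for name, value in loo.items():
    print(f"{name}: {value:.3f}")
```

If the association with asthma is similar across all three reduced scores, no single species is driving it, which is the conclusion the authors report.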
C13: Figure 1 is very illustrative. However, one might also mention the risk profiles (family history) and the underlying intervention of COPSAC2010. Since COPSAC2010 is an intervention study, stratified or adjusted analyses are recommended. The selection of participants of observational studies and trials may vary. This should be considered when comparing the two different study types.
R13: We thank the reviewer for these excellent suggestions and have updated figure 1 to include this information. Since the novel analysis in the current manuscript is performed in the older COPSAC2000 cohort, there were no pregnancy interventions. In addition to the updated fig 1, we highlighted the difference in selection criteria between the two COPSAC cohorts in the discussion: "First, COPSAC2000 is a high-risk cohort where all the mothers had a history of doctor-diagnosed asthma 14. In contrast, the COPSAC2010 cohort was unselected by design and had a rate of 30% self-reported asthma, which was higher than those who were invited but declined to participate 58."
C14: As suggested by Figures 1 and 4, COPSAC2010 is an essential part of this paper. So it should also be mentioned in the abstract.
R14: We agree and have now included COPSAC2010 in the abstract. Note the abstract is now completely reformatted to comply with the journal style.
C15: Figure 2A is not very informative because AUROC is a global measure veiling potentially substantial differences between sensitivity and specificity. E.g. for H. influenzae sequencing is highly specific but hardly sensitive as illustrated by Fig. S1. Moreover, the scale of AUROC is prone to misinterpretation due to its offset of 0.5. One may replace Fig. 2A by Fig. S1, as it is not affected by the mentioned shortcomings.
R15: We agree with many of these technical assessments on the interpretation of ROC curves presented here. We considered the suggested changes to the figures and have decided to change the figures, but in a slightly different way. We have removed the AUC values printed to avoid such misinterpretation, and instead written the median + interquartile range of relative abundances in each group in the text. The reasoning behind this, which is related to R1, R2, and R9, is that the aim of the analysis is not to infer differences between the nasopharyngeal and hypopharyngeal microbiome. Rather, it is a necessary first step to show that despite the difference in sampling, we are still picking up a closely related signal, which is a prerequisite for the main objective: studying the association between bacterial colonization and asthma. We would therefore like to focus less on the "prediction" aspect; we here instead show that these samples represent distinct but related niches. The more technical considerations of sensitivity, specificity and AUC values are, in our opinion, better relegated to a supplemental figure, so as not to throw off the reader before reaching the main objectives of the study.
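The point raised in C15, that a single AUROC value can hide very different sensitivity and specificity, can be made concrete with a small sketch. The abundance values, culture labels, and detection threshold below are invented for illustration, and the helper functions are hypothetical; the sketch assumes both culture-positive and culture-negative samples are present.

```python
def sensitivity_specificity(abundance, culture_positive, threshold):
    """Treat 'abundance > threshold' as detection by sequencing and compare to culture."""
    tp = fn = tn = fp = 0
    for value, positive in zip(abundance, culture_positive):
        detected = value > threshold
        if positive and detected:
            tp += 1
        elif positive:
            fn += 1
        elif detected:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auroc(abundance, culture_positive):
    """Rank-based AUROC: chance that a culture-positive sample outranks a negative one."""
    pos = [v for v, p in zip(abundance, culture_positive) if p]
    neg = [v for v, p in zip(abundance, culture_positive) if not p]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: four culture-positive samples followed by four culture-negative ones.
abundance = [0.40, 0.0, 0.0, 0.05, 0.0, 0.0, 0.0, 0.0]
culture_positive = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(abundance, culture_positive, threshold=0.01)
auc = auroc(abundance, culture_positive)
print(sens, spec, auc)
```

For these toy values the threshold gives sensitivity 0.5 and specificity 1.0 while the AUROC is 0.75, so the summary value alone would not reveal that half of the culture-positive samples go undetected by sequencing.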
C16: There are no inverse correlations of asthma with sequenced taxa reported. They might be informative for protective taxa previously missed by culturing techniques.
R16: We agree with the reviewer that any such protective taxa would be of immense interest. The way our analysis is set up, we would be able to identify any such taxa if they displayed inverse associations with asthma. However, none turned out significant in our analysis, similar to the results from the COPSAC2010 cohort, where we likewise only detected positive associations. This may be airway specific and is in contrast to findings from the early life gut microbiota, where many taxa seem protective of later asthma development.
C17: There might be a cohort effect in the sense that lead pathogens might change over time. Is there any data available (e.g. from hospital microbiology records) that the spectrum of respiratory pathogens might have changed over the years?
R17: First of all, we agree with this assessment from the reviewer. The reviewer is indeed correct that there is extensive surveillance of pathogens in the Danish hospital system. There are public reports such as DANMAP (https://www.danmap.org/-/media/sites/danmap/downloads/reports/2021/danmap_2021_version-1.pdf) and an online dashboard (https://statistik.ssi.dk/), which unfortunately are not optimal for this question, since they focus mostly on antimicrobial resistance and surveillance of invasive bacterial infections, such as pneumococcal meningitis. The internal hospital microbiology records system, MiBa, is currently not open to research pending some unsolved legal barriers (https://miba.ssi.dk/forskningsbetjening), except for data extracted for the official surveillance reports. However, the point still stands and may even extend to changes in strains/serotypes of pathogens, which is probably even more fine-grained than what the system captures. We have briefly touched on this in the discussion, where we speculate that e.g. pneumococcal strains may have changed in response to the infant pneumococcal vaccine, which differed between the two cohorts. However, in the absence of more concrete data, there is a limit to what we can conclude apart from posing this hypothesis. We have now added a line that similar changes may have occurred over time for other taxa.
"Fourth, with regard to S. pneumoniae, Denmark introduced the pneumococcal vaccination as part of the childhood vaccination programme in 2007 59. While the infants in the COPSAC2010 study were not yet vaccinated themselves at the time of sampling, the high child vaccination rates in the population (>95% in Denmark 60) can influence the S. pneumoniae strains in the community 61 and potentially affect associations with asthma. Similar changes may have occurred over time for other taxa."

Reviewer #1 (Remarks to the Author): Thank you for the opportunity to review a revised manuscript entitled "The airway microbiota of neonates colonized with asthma-associated pathogenic bacteria" by Dr Thorsen and colleagues.
I appreciate the effort put into revising the earlier version of the manuscript; however, my major criticisms of this work were dismissed; therefore, my original assessment still stands. This manuscript does not provide any novel findings which advance the field of airway microbiome in pediatric asthma for the following reasons:
• The primary finding of a correlation between microbial culture and detection of three well-established asthmagenic respiratory pathogens by sequencing has been previously published and is an accepted fact in the community. Here is a more recent example: Toivonen et al., Pediatrics (2020) https://doi.org/10.1542/peds.2020-0421
• The study design is flawed. Since the swab samples from COPSAC2000 sequenced in this study were used for Mycoplasma culture before storage and subsequent sequencing, they are not representative of the nasopharynx at the time of collection. Clinical microbiome investigations are moving away from sequencing samples from "cohorts of convenience" for the sake of reporting sequenced data (once a novelty) to well-thought-out clinical studies not restricted by inadequately collected/handled or processed samples, which significantly undermines the current study's findings.
• The study was inadequately designed to examine differences in bacterial composition in distinct compartments of the airway; as such the comparison to COPSAC2010 Hypopharynx aspirate is of limited value and distracts from the main objective of the study (as defined by the authors themselves).
• Although the authors addressed some of my previous concerns in their direct responses, they did not attempt to restructure the analysis or the paper structure to uncover a novel clinically relevant observation that moves the field forward, which was a major criticism of the original submission.
Reviewer #2 (Remarks to the Author): I thank the authors for their open responses. The main message of the paper is much clearer now: the most important asthma pathogens (Haemophilus influenzae, Streptococcus pneumoniae, and Moraxella catarrhalis) are found in different studies by different methods at different sampling locations. This is reassuring but the novelty is somewhat limited given other publications on this topic since the legendary NEJM 2007 paper.
In addition, this finding does not answer the question on discrepant findings as stated in the introduction: "it has remained unclear whether the differences in asthma-associated bacteria between these studies were due to differences in sampling, the applied detection technique (culturing vs sequencing), or simply due to inherent cohort differences." Obviously, the study design was not set up to answer this question and it would be unfair to rate studies performed 10 or 20 years ago against the current state of the art. However, as the mentioned question is a central element of this paper it should receive a more definitive answer.
For the two COPSAC studies, the current findings reconcile previous discrepancies with respect to the 2007 pathogens but the prominent role of Veillonella and Prevotella in COPSAC2010 remains enigmatic. The current paper should give a more definitive answer to this question. Was this due to an artefact? Are said taxa bystanders of the 2007 pathogens? This is suggested by the correlations reported at the end of the current results section but an adjustment for the correlation in the previous associations with asthma is missing.

Minor comments:
Line 154 ("while the pathogen colonization was associated with hospitalization for wheeze"): This statement should be amended by an odds ratio.
For M. lincolnii to be a new candidate since the NEJM 2007 publication one should clearly state that the new candidate is not just a product of an updated version of the taxonomic assignment.The wording in lines 173-5 is probably not clear enough.
Lines 218/9: I would also mention that adjustment for confounders did not produce results substantially different from the raw estimates since adjustment for many measured confounders can amplify residual confounder bias.
Line 304: "nuances" seems inappropriate given the substantial differences just listed.
Unfortunately the main document contains the main text twice but not the supplementary figures. From the responses I understand that there were no major changes in the supplement. Therefore I would not insist on assessing the supplement.

REVIEWER COMMENTS
Reviewer #1 (Remarks to the Author):
Comment 1: Thank you for the opportunity to review a revised manuscript entitled "The airway microbiota of neonates colonized with asthma-associated pathogenic bacteria" by Dr Thorsen and colleagues. I appreciate the effort put into revising the earlier version of the manuscript; however, my major criticisms of this work were dismissed; therefore, my original assessment still stands. This manuscript does not provide any novel findings which advance the field of airway microbiome in pediatric asthma for the following reasons: The primary finding of a correlation between microbial culture and detection of three well-established asthmagenic respiratory pathogens by sequencing has been previously published and is an accepted fact in the community. Here is a more recent example: Toivonen et al., Pediatrics (2020) https://doi.org/10.1542/peds.2020-0421
R1: Thank you for once again evaluating our manuscript. The major concerns raised by the reviewer in the previous review were definitely not dismissed, but rather led to a large revision and a more precise and strengthened manuscript, for which we are grateful. As noted in the previous response letter, the comparison of culture and sequencing is not the primary finding of our study but only included as an intermediate result, in essence a prerequisite for the primary objective: to recapitulate the pathogen-asthma association using sequencing data and to look for potential novel taxa contributing to this phenomenon. Before we could examine the association with asthma, we first needed to establish that the two methods (culture and sequencing) were in agreement, also due to the differences in sampling. We acknowledge that this misunderstanding about the study's focus is likely due to imprecise wording in the introduction, which we have now amended, see R3. The interesting study mentioned by the reviewer is cited in our manuscript, due to its findings of an association between longitudinal microbiota profiles and asthma (Ref 37).
C2: The study design is flawed. Since the swab samples from COPSAC2000 sequenced in this study were used for Mycoplasma culture before storage and subsequent sequencing, they are not representative of the nasopharynx at the time of collection. Clinical microbiome investigations are moving away from sequencing samples from "cohorts of convenience" for the sake of reporting sequenced data (once a novelty) to well-thought-out clinical studies not restricted by inadequately collected/handled or processed samples, which significantly undermines the current study's findings.
R2: This comment is a continuation of the above misunderstanding of the aim of the study, which was not to compare the two microbiological compartments/techniques but to reassess and expand the association with asthma development using a culture-independent technique. While the swab type and handling might bias our ability to claim a precise microbiota composition, it was not biased towards the asthma endpoint. The strong recapitulation of the association between pathogenic bacteria and asthma shows that the swabs were indeed still useful despite any delays before freezing and the 20 years of storage they were subjected to. See also below.

C3: The study was inadequately designed to examine differences in bacterial composition in distinct compartments of the airway; as such the comparison to COPSAC2010 Hypopharynx aspirate is of limited value and distracts from the main objective of the study (as defined by the authors themselves).
R3: Please refer to responses above. We assume the reviewer is referring to this section, from the Introduction: "Our aim was to use this data to examine whether the pathogen culture results could be reestablished using sequencing data from these anatomically distinct but adjacently collected samples. Further, we examine whether any other taxa associating with asthma could be identified using the more sensitive sequencing method, in particular a possible replication of the COPSAC2010 Veillonella and Prevotella associations." We agree that this could be misunderstood as if the aim was to compare the different airway compartments. We are grateful for the chance to make this clearer, in particular what the word "results" in the first sentence refers to. We have now changed this in the hope that such misunderstanding can be avoided for new readers: "Our aim was to use this data to examine whether the pathogen culture association with asthma could be re-established using sequencing data from these anatomically distinct but adjacently collected samples." We have also prefaced the relevant section of the Results: "First, we compared the sequenced nasopharyngeal swabs to the hypopharyngeal aspirate culture results originally reported to be associated with asthma by age 5 5 , in order to ensure that there was agreement between these two sample sets and methods before comparing their associations with asthma." And finally, before progressing to the asthma associations in the Results, "Having established that these two samples were to an extent comparable, we progressed to our main aim of investigating if the association between neonatal pathogen colonization with S. pneumoniae, H. influenzae, and M. catarrhalis and asthma 5 could be recapitulated with the sequencing data […]"
C4: Although the authors addressed some of my previous concerns in their direct responses, they did not attempt to restructure the analysis or the paper structure to uncover a novel clinically relevant observation that moves the field forward, which was a major criticism of the original submission.
R4: We indeed amended and expanded the manuscript in multiple ways based on pertinent comments from the reviewers in the previous responses. Keeping in mind our focus of gaining a better understanding of the association between neonatal bacterial colonization and asthma, we believe that our results here are indeed highly clinically relevant, establishing that these pathogenic species remain associated with asthma despite differences in sampling location, handling, and detection method. Furthermore, we were able to correlate these pathogens with Veillonella and Prevotella, which have been further analyzed and corroborated in the current revision (please refer to R7). Our ultimate goal is to translate an understanding of this phenomenon into new avenues for asthma prevention in early life.
Reviewer #2 (Remarks to the Author):
C5: I thank the authors for their open responses. The main message of the paper is much clearer now: the most important asthma pathogens (Haemophilus influenzae, Streptococcus pneumoniae, and Moraxella catarrhalis) are found in different studies by different methods at different sampling locations. This is reassuring but the novelty is somewhat limited given other publications on this topic since the legendary NEJM 2007 paper.
R5: We thank the reviewer for this assessment. We agree that the novelty in the study does not lie in simply showing an association between pathogen colonization and asthma. Rather, it lies in our finding that this phenomenon seems to be sufficiently described by these three pathogens alone. Surprisingly, we found no other taxa to be individually significant, as we might have expected based on newer sequencing-based findings in COPSAC2010 and other cohorts. Furthermore, it seems that the presence of these three pathogens could be part of the same latent pattern of Veillonella and Prevotella by virtue of their positive correlation in our data. See also R7.
C6: In addition, this finding does not answer the question on discrepant findings as stated in the introduction: "it has remained unclear whether the differences in asthma-associated bacteria between these studies were due to differences in sampling, the applied detection technique (culturing vs sequencing), or simply due to inherent cohort differences." Obviously, the study design was not set up to answer this question and it would be unfair to rate studies performed 10 or 20 years ago against the current state of the art. However, as the mentioned question is a central element of this paper it should receive a more definitive answer.
R6: We agree that this could be concluded and conveyed in a clearer manner. While our data does not allow us to fully reconcile these differences, we can conclude that the pathogen-asthma association did not critically depend on sampling location nor detection technique. This underscores the lack of association for the three pathogens in the 2010 cohort, and highlights that the observed differences must be due to inherent cohort differences, which we provide several examples for. This has now been clarified in the discussion: "We can establish based on these results that the association between pathogenic bacteria and asthma in COPSAC2000 did not critically depend on sampling location and method nor detection method, which extends to the unresolved differences between the asthma-associated bacteria between the two COPSAC cohort studies listed in the introduction, and points to inherent cohort characteristics as the most important factor behind these differences."
C7: For the two COPSAC studies, the current findings reconcile previous discrepancies with respect to the 2007 pathogens but the prominent role of Veillonella and Prevotella in COPSAC2010 remains enigmatic. The current paper should give a more definitive answer to this question. Was this due to an artefact? Are said taxa bystanders of the 2007 pathogens? This is suggested by the correlations reported at the end of the current results section but an adjustment for the correlation in the previous associations with asthma is missing.
R7: This is indeed an unresolved and even enigmatic question that we share a keen interest in trying to understand. The correlation results that the reviewer refers to (Supplementary Fig. 5) showed that the pathogen score was positively associated with the relative abundance of Veillonella, Prevotella and several of their species, but did not further explore this phenomenon. We have now expanded this analysis to compare these relationships against the backdrop of the rest of the microbial community, included as a new Supplemental Fig. 6.
Here, we show some quite interesting new details. Using two complementary correlation metrics (Spearman + SparCC), we show that not only are Veillonella and Prevotella correlated with the pathogenic genera, they also seem to form a subcommunity structure as we have previously hypothesized. This clustering is shown in the heatmap, which was ordered using hierarchical clustering based on each correlation metric. Furthermore, we show that Veillonella and Prevotella abundances are also among the taxa most strongly correlated with the pathogen score, and by stratified analyses that the correlation for Prevotella is likely underestimated due to undersampling. We cannot justify making strong claims as to which of these taxonomic entities are true vs which may be bystanders on the basis of these observational data; we can merely show and report this interesting apparent link between them, which we speculate has relevance for the reported asthma associations and may even be different markers for the same underlying phenomenon. Pertaining to the idea of an adjusted analysis, we have provided the results in a table below; however, we are not convinced that such an adjusted analysis can help disentangle the relationship between these bacteria, for the following reasons: First, if these bacteria (pathogens + V/P) are indeed representatives of the same latent subcommunity, such an adjustment would detract from their joint signal and be an overadjustment. Second, since in the overall analysis the pathogen score was significant and Veillonella/Prevotella was not, there doesn't seem to be a sufficiently strong signal for them to attempt adjustment. With these caveats in mind, we see in the table that when mutually adjusted, the estimate for the pathogen score remains unchanged while the estimate for V/P attenuates but remains nonsignificant. We do not think these results add sufficient insight to warrant inclusion in the paper, but are of course open to counter-arguments. We have now described this new figure at the end of the Results section.
R9: We agree that this is an important point to communicate clearly, and have now added some context for this: "M. catarrhalis and M. lincolnii are two distinct species, and to confirm that their 16S rRNA gene sequences are indeed different enough to confidently separate them in our data, we investigated their genetic similarity. We found them distinctly identifiable in the V3-V4 region used in this study based on 16S rRNA genes in reference genomes (Supplementary Fig. 4)."

C10: Lines 218/9: I would also mention that adjustment for confounders did not produce results substantially different from the raw estimates since adjustment for many measured confounders can amplify residual confounder bias.
R10: We have now mentioned this.
C11: Line 304: "nuances" seems inappropriate given the substantial differences just listed.
R11: We agree and have rephrased this: "Despite these differences, …"
C12: Unfortunately the main document contains the main text twice but not the supplementary figures. From the responses I understand that there were no major changes in the supplement. Therefore I would not insist on assessing the supplement.
R12: We apologize; the supplement was omitted from the submission by mistake. There were only very minor formatting changes between the two versions (e.g. "Supplemental fig 1" instead of "Figure S1"). Note that in this submission, we have now added Supplemental Fig. 6, as detailed in R7.
Reviewer #2 (Remarks to the Author): I thank the authors for their clarifications. From R5 I understand that the main result of the paper is that the association between pathogen colonization and asthma is "sufficiently described by these three pathogens alone". I appreciate the usage of "sufficiently" (instead of "exclusively") because the study does not have the statistical power to demonstrate that no other pathogens are involved. However, the presentation of Veillonella and Prevotella in Fig. 4c suggests independent effects by those two genera, which is misleading and might be understood as contradicting the main result.
In this context, I appreciate the mutually adjusted analyses presented in R7. Though the presence/absence of Veillonella/Prevotella is correlated with the pathogen score, there is no evidence of collinearity or, to use the authors' words, "overadjustment". Anyway, the change in the estimate is suggestive of confounding by the pathogen score. Therefore I recommend replacing figure 4c by a forest plot showing the raw and mutually adjusted models presented in R7. I do not consider this a contradiction to the earlier analysis presented in Thorsen et al, Nat Comm 2019.
Rather I share the authors' perception that they might have captured the same signal in both cohorts, but in COPSAC 2010 by proxies of the three pathogens. Therefore, I consider the new Supplementary Fig. 6 very helpful; panels a and b might be presented in the main manuscript.
I share the other reviewer's concerns that the study design is not suitable for comparing sampling sources and detection methods. However, I acknowledge that both COPSAC studies were not set up for such comparisons. Therefore, I recommend emphasizing that (1) such comparisons have been covered by other more suited studies and (2) the currently presented analyses are secondary analyses of studies with different original aims. The latter, however, involve the relation of airway pathogens and asthma development, which is in the focus of the current manuscript.
With respect to response R6, I suggest shifting the focus from the difference between the studies towards the common finding of the pathogen cluster. In my opinion, consistency between studies can be demonstrated without explaining the differences between them. Besides, the studies are underpowered to exclude differences. Insofar, the statement "We can establish based on these results that the association between pathogenic bacteria and asthma in COPSAC2000 did not critically depend on sampling location and method nor detection method" is too strong.
The argument in lines 226-229 is not logical as it mixes up the issues of temporality and confounding. Besides, I would not consider pharmacological treatment in a cross-sectional study more problematic than the intervention performed in COPSAC2010.
Minor: It might be easier for the reader to state in the figure legends which study is the basis of the analyses.
The use of "abundance" in the plural is jargon and might be replaced by "abundance values".

REVIEWERS' COMMENTS
We thank the reviewer for agreeing to assess our revised manuscript and for highlighting some relevant points in their assessment. Replies are given to each comment in blue font.
Reviewer #2 (Remarks to the Author):
Comment 1: I thank the authors for their clarifications. From R5 I understand that the main result of the paper is that the association between pathogen colonization and asthma is "sufficiently described by these three pathogens alone". I appreciate the usage of "sufficiently" (instead of "exclusively") because the study does not have the statistical power to demonstrate that no other pathogens are involved. However, the presentation of Veillonella and Prevotella in Fig. 4c suggests independent effects by those two genera, which is misleading and might be understood as contradicting the main result.
Reply 1: We agree with this point; we cannot rule out smaller effects from other bacteria. Regarding Fig. 4c, see R2.
C2: In this context, I appreciate the mutually adjusted analyses presented in R7. Though the presence/absence of Veillonella/Prevotella is correlated with the pathogen score, there is no evidence of collinearity or, to use the authors' words, "overadjustment". Anyway, the change in the estimate is suggestive of confounding by the pathogen score. Therefore I recommend replacing figure 4c by a forest plot showing the raw and mutually adjusted models presented in R7. I do not consider this a contradiction to the earlier analysis presented in Thorsen et al, Nat Comm 2019. Rather I share the authors' perception that they might have captured the same signal in both cohorts, but in COPSAC 2010 by proxies of the three pathogens. Therefore, I consider the new Supplementary Fig. 6 very helpful; panels a and b might be presented in the main manuscript.
R2: We are grateful for this pertinent comment from the reviewer. As suggested, we have now removed the Kaplan-Meier curve (prev. Fig. 4c) and instead added a forest plot of the crude vs mutually adjusted estimates. Furthermore, we have added the heatmap (prev. Supplementary Fig. 6A) as a new Figure 5. We have kept the prev. panel b in the supplemental figure since it is largely redundant with the former, and therefore more suitable as a supplemental figure. We have now referred to these changes in the results, "Detection of either of these two genera was not associated with persistent wheeze/asthma by age 7 years, nor were they differentially abundant at the genus level (Fig. 4c-e). Adjusting analyses for the same covariates mentioned above did not change the results, but mutually adjusting the presence/absence of Veillonella/Prevotella and the pathogen score resulted in attenuation of the estimate for Veillonella/Prevotella but not the pathogen score (Fig 4e)." and "We found that Veillonella and Prevotella were included in an apparent cluster with Streptococcus and Haemophilus (Fig 5 and Supplemental Fig. 6)."
C3: I share the other reviewer's concerns that the study design is not suitable for comparing sampling sources and detection methods. However, I acknowledge that both COPSAC studies were not set up for such comparisons. Therefore, I recommend emphasizing that (1) such comparisons have been covered by other more suited studies and (2) the currently presented analyses are secondary analyses of studies with different original aims. The latter, however, involve the relation of airway pathogens and asthma development, which is in the focus of the current manuscript.
R3: We agree and have now emphasized these aspects in the discussion section: "Importantly, our study was not designed to compare different upper airway sampling locations or detection methods, which has been performed elsewhere 20-22 , but rather to examine early life risk factors for disease, in particular asthma."
C4: With respect to response R6, I suggest shifting the focus from the difference between the studies towards the common finding of the pathogen cluster. In my opinion, consistency between studies can be demonstrated without explaining the differences between them. Besides, the studies are underpowered to exclude differences. Insofar, the statement "We can establish based on these results that the association between pathogenic bacteria and asthma in COPSAC2000 did not critically depend on sampling location and method nor detection method" is too strong.
R4: We have now modified the section, toned down the wording, and shifted focus to include this consistency. The section now reads, "We can establish based on these results that the association between pathogenic bacteria and asthma in COPSAC2000 recapitulated the culture-based findings independent of sampling location & method and detection method. While we did find overlaps between the taxa representing the pathogenic score and Veillonella and Prevotella, still some unresolved cohort differences exist."
C5: The argument in lines 226-229 is not logical as it mixes up the issues of temporality and confounding. Besides, I would not consider pharmacological treatment in a cross-sectional study more problematic than the intervention performed in COPSAC2010.
R5: The intent with this comment was to compare the longitudinal design (recruiting healthy newborns) with case-control studies of patients vs healthy controls, where there are numerous differences (incl. treatment) as part of the study design. However, we agree that this was not formulated in a good way; of course the reviewer is correct that confounding is still an important challenge. Therefore, we have removed the sentence from the manuscript.

[Table header only; values not preserved in this extract. Row: Crude (as previously reported in manuscript). Columns: HR, Lower, Upper, p-value.]
For M. lincolnii to be a new candidate since the NEJM 2007 publication one should clearly state that the new candidate is not just a product of an updated version of the taxonomic assignment. The wording in lines 173-5 is probably not clear enough.